
























ADA102991 



FOUNDATIONS AND CONCEPTS OF SOFTWARE PHYSICS 




by 


R. P. Kovach and K. W. Kolence

April 1979


Institute for Software Engineering 
P.O. Box 637, Palo Alto, CA 94303






DISTRIBUTION STATEMENT A 
Approved for public release;
Distribution Unlimited 




DTIC 

ELECTE 

AUG 14 1981














ABSTRACT 


FOUNDATIONS AND CONCEPTS OF SOFTWARE PHYSICS 


R. P. Kovach and K. W. Kolence 


Kolence's theory of the performance of computing systems is restated.
In this development, the basic subsystems are reduced to two: logical
subconfigurations and software units. These and the three fundamental
variables, work, time and storage occupancy, are all defined using as a
basis a single logical construct, the set of instantaneous descriptions
of instruction executions. Basic results are derived for time
relationships (utilizations and concurrency levels), work relationships
(distribution numbers), and work and time relationships (absolute power
and relative power). Interpretations of variables used in other
approaches, in software physics terms, are indicated.


INTRODUCTION


Software physics is a theory for characterizing the behavior of
executing computing systems, named and propounded chiefly by
K. W. Kolence. Its fundamental concepts, basic results and a variety
of applications have been given in a number of publications by him
(KOLE70, KOLE72, KOLE73, KOLE75a, KOLE75b, KOLE76). The fullest and
most recent treatment of the theory and some of its applications is
given in KOLE76. The main motivation of the book, and the theory
itself, arises from the many problems inherent in capacity management
functions in data processing organizations. At this time a considerable
body of work exists, based on software physics, in such areas as
workload forecasting and capacity planning, equipment and configuration
planning, performance management, cost accounting and charging
policy, and budgeting. So far this work has withstood well the many
tests provided by the rough and tumble world of present-day computer
installations.

The present paper is nothing more than a restatement of the
nuclear theory as presented by Kolence. It is done in order to better
meet the accepted requirements of theory construction: economy of
assumptions and basic (undefined) terms, more compact and rigorous
derivation of results, and more logical and orderly development based
on established principles and methods.

In the present development only one (assumed) logical construct is
required, namely, a complete set of complete instantaneous descriptions
of instruction executions. From this, using set theoretic
(including graph theoretic) methods, the two basic subsystems, logical
subconfigurations and software units, are defined. Then the three
fundamental variables, work, time and storage occupancy, are defined.
Relationships among these are determined by the structural properties
of software unit/logical subconfiguration pairs. Kolence's notation
has been modified slightly and extended. A few results, not previously
educed, are given.

The only other attempts at a coherent approach to this subject are
queueing theory and what is now known as the operational analysis of
queueing networks. In this paper there are a few anticipatory hints
on interpretations of software physics variables and relationships
leading to queueing theoretic results. This whole area is fully
explored and developed by Traister (TRAI79). From the point of view
of the queueing analyst the basic theory of software physics analyzes
service times into their constraints. In a practical setting this is
important, revealing what variables may be changed and what the
consequences are for service times. The use of software work in place of
service requests introduces a weighting factor to requests, more
accurately reflecting the demand placed on the system by requests.

The introduction of system related clocks and the relationships among
their timings adds a new dimension to the characterization of
utilizations, service times and response times. The indications at this
time are that not only can software physics subsume queueing network
analysis as applied to computer systems but that a whole new set of
results will emerge from the combined approach.







THE LOGICAL STRUCTURE OF COMPUTING SYSTEMS 


Figure 1 shows a conventional representation of a computer system
configuration. There is a wide variety of charting conventions
employed in drawing diagrams such as this but they all have the same
essential properties, the differences arising from considerations of
ease of maintenance of the drawing, symbol conventions, local
traditions and so forth. Such graphs are very useful for showing all the
devices in an installation, the cabling, logical and physical
addresses and similar hardware related data.

When analyzing the dynamic behavior of computing systems, however,
these diagrams may be deceptive because they ignore the influence of
time and time sequence on the topology of the system. False
assumptions may be made unconsciously, leading to erroneous conclusions
about the relationship between configuration and performance
characteristics. In the example at hand, for instance, it appears that
there are two paths between main storage and the disk drives in the
second string, one via channel 1 and one via channel 2. At the time
of any given execution, however, a drive may be connected to main
storage by only one channel, either channel 1 or channel 2. With time
taken into consideration, all such apparent alternate paths are
mutually exclusive.

It is convenient to think of the usual configuration diagram as 
the graph union of all possible paths that may occur in the course of 
some execution. With that in mind then, appearances notwithstanding, 
these configuration graphs are rooted trees. There is one path 
between any higher node and any lower one and there are relative roots 
at every level. Most important for present consideration is that they 
exhibit the upper lattice property, that is, every node contains or 
covers the properties of each node dependent from it. So, for 
instance, we say that channel 1 is a disk channel because it contains 
control units which contain disk drives. For the same reason, channel 
2 is a disk channel. Channel 3 is a tape channel because it contains 
control units which contain tape drives. Channel 2 is a tape channel
for the same reason.

FIGURE 1

At this point we require a definition for a logical subconfiguration.
This in turn requires a definition of a composition operation
on graphs which we call a graft. To form a graft of several trees,
take their graph union. If the resultant graph has a unique root then
the graft is complete. For example, the union of:



which is a graft. If the resultant graph does not have a unique root,
then one is created and all the relative roots of the union are made
immediate descendants of it. For example, the union of:



which produces the graft:

COMBINED EQUIPMENT CLASS AND CONFIGURATIONAL
LOGICAL SUBCONFIGURATION









A logical subconfiguration is a graft of configuration subtrees
selected either by some set of properties that they contain, or by
their root nodes or, occasionally, by a combination of both. The most
commonly used logical subconfigurations of the first kind are called
equipment class subconfigurations. Some of them are illustrated in
figures 2 and 3, based on the configuration of figure 1. Some examples
of the second kind of logical subconfiguration, based on configurational
characteristics, are shown in figure 4. Note also that the
logical subconfiguration made up of all channel subconfigurations
would produce $\theta$, the Input/Output subconfiguration. Figure 5 shows a
combined case where both the root nodes (control unit) and the
equipment class selector (disk) are specified.
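
The graft operation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a tree is represented as a pair (set of node names, set of parent/child edges), and the node names `a1`, `b11`, `d111` and so on, as well as the helper names `make_tree`, `subtree`, `roots` and `graft`, are hypothetical and do not come from the paper.

```python
def make_tree(edges):
    """Build a (nodes, edges) pair from an iterable of (parent, child) edges."""
    edges = set(edges)
    nodes = {n for edge in edges for n in edge}
    return nodes, edges


def roots(tree):
    """Nodes that are nobody's child, i.e. the relative roots of the graph."""
    nodes, edges = tree
    return sorted(nodes - {child for _, child in edges})


def subtree(tree, root):
    """The configuration subtree hanging from `root` (root included)."""
    _, edges = tree
    keep, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(c for p, c in edges if p == node)
    return keep, {(p, c) for p, c in edges if p in keep}


def graft(trees, new_root="root"):
    """Graph union of the trees; a new root is created only if none is unique."""
    nodes = set().union(*(t[0] for t in trees))
    edges = set().union(*(t[1] for t in trees))
    top = roots((nodes, edges))
    if len(top) > 1:
        edges |= {(new_root, r) for r in top}
        nodes |= {new_root}
    return nodes, edges


# Hypothetical configuration: two channels, one control unit each.
config = make_tree([
    ("psi", "a1"), ("a1", "b11"), ("b11", "d111"), ("b11", "d112"),
    ("psi", "a2"), ("a2", "b21"), ("b21", "d211"),
])

# Grafting all channel subtrees has no unique root, so one is created.
io = graft([subtree(config, "a1"), subtree(config, "a2")], new_root="io")

# Grafting a control unit subtree onto its channel already has a unique root.
single = graft([subtree(config, "b11"), make_tree([("a1", "b11")])])
```

Here `io` plays the role of the I/O subconfiguration built from all channel subconfigurations, while `single` shows the case where the union is already a rooted tree and nothing is added.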

Typically, the only subconfigurations with more than one level are
the various I/O subconfigurations. These multiple-levelled structures
are supplied numerical identifiers, in the form of subscripts, by the
conventional method used for numbering nodes of trees: one number for
the first level, two numbers for the second level and so on. Thus
drive number 4 on control unit number 2 on channel number 3 is
$\delta_{324}$. The control unit subconfiguration is designated
$\beta_{32}$ and the channel subconfiguration, $\alpha_3$. This is, of
course, the exact analog of the physical and logical addressing scheme
used both in hardware and software systems. Note also that in the
conventional configuration diagram certain devices (drives especially)
may have more than one designation. This causes no inconsistency
whatever because at any given time an action occurs on a device under
only one designation.

In other words, the actions going on under the several designations
for the device are mutually exclusive with respect to time. Furthermore,
all of our instrumentation will so report it. The fact that a
single hardware device has several logical designations has no effect
on the logical structure; it only places a constraint on the domain of
possible simultaneous events.

The previous discussion and the figures mentioned there require an
explanation of the Greek letters. Certain logical subconfigurations
are so frequently used that it is convenient to have a compact notation
to refer to them. The equipment classes are indicated with a
brief word or abbreviation which is self-explanatory, e.g., disk,
tape, ptr, card, etc. Configurational logical subconfigurations are
written as a single Greek letter, e.g., $\psi$ is the whole configuration,
$\theta$ is the I/O subconfiguration and $\gamma$ is the CPU
subconfiguration. For those subconfigurations where the root is actually
a device, a Greek letter designates the subconfiguration and the
corresponding Latin letter the device which is the root node. E.g.,
$\alpha_2$ is the channel 2 subconfiguration and $a_2$ is the channel
device which is channel 2; $\beta_{21}$ is the first control unit
subconfiguration in the $\alpha_2$ subconfiguration and $b_{21}$ the
control unit device which is its root node.


PROCESSORS AND STORAGES


In the conventional configuration diagram of figure 1, with the
exception of the main storage, all of the devices depicted are
processors. In general, all devices are either processors or storages.
They are distinguished by their operational relationship in an execution:
processors change the contents of storage. A rough rule for
distinguishing the two is: if the device may be programmed
(microprogrammed) then it is a processor; if its contents are changed by
a processor, it is a storage.

Clearly, processors contain storages and storages have built-in
processors to control them. For instance, a channel may contain
memories (registers) for such things as commands, addresses, error
detection codes, status codes, etc. Those memories may be manipulated
by a microprocessor which itself contains scratch-pad memories and
internal registers. This bifurcation can be recursively applied right
down to the movement of electrons across molecular boundaries, if
there is reason to do so. When we are examining activity at one level
of processors and storages, the activity on storages by processors
within them is considered internal, not part of the activity we are
studying. Typically, internal activity is treated as a loss or
degradation factor much like friction in mechanics. For example, if
we are interested in a channel's activity in handling data transfers
for a computation in the CPU, then the channel work involved in
channel command processing, status checking and so on is considered
internal work. If, on the other hand, we are interested in the
activity on the status registers, address registers, etc. of the
channel, then the activity of the microprocessor manipulating those
registers is the external work and the activity within the
microprocessor on its scratch pads and registers is internal work. These
distinctions make the problem manageable, allowing us to isolate
important variables at one level (ignoring the lower levels) and then
reapply those concepts at lower levels, much as the laws of mechanics
discovered at the macroscopic level (ignoring friction) were applied
to the molecular level to account for friction.








It is not necessary for a processor to be a physically
distinguishable device, some free-standing box sitting out on the
computer room floor. As long as a set of processor functions, exclusive
of other processor functions, can be distinguished, as long as a
mental boundary can be put around a processor subsystem, then it may
be considered a processor. So, for instance, a built-in channel, even
when it shares circuitry with the CPU so that some of their execution
times are mutually exclusive, is still a distinguishable processor.
Furthermore, as we shall see, all of the relationships of fundamental
variables that apply to processors individually also apply, with
appropriate accommodation, to logical subconfigurations of processors,
making them in effect logical processors.

All of the common equipment classes may be classified readily into 
processors or storage. A partial list is given below for reference 
and classification. 

Processors:

CPU
Disk Drives
Tape Drives
Drum Drives
Printers
Card Readers
Card Punches
Paper Tape Readers
Paper Tape Punches
Terminals
Mass Storage Drives
MICR Readers
A/D and D/A Converters
Low Speed Channels (e.g., byte MPX)
High Speed Channels (e.g., selector, Blk MPX)
Disk Control Units
Tape Control Units
Drum Control Units
Printer Control Units
Card Control Units
Paper Tape Control Units
MICR Control Units
Transmission Control Units
Terminal Control Units
Mass Storage Control Units
Etc.

Storages:

Main Storage (e.g., core, diode)
Registers (everywhere)
Disk (platters)
Tape (reels)
Drum
Paper
Cards
Paper Tape
Cartridges (mass store)
Etc.

The reader may have noticed that one whole category of device has 
not been mentioned so far, namely cables, transmission lines, busses 
and the like. From the point of view of configurations they are 
treated only as connectors, arcs between nodes. They only enter into 
consideration when their capacity has some limiting effect on the 
performance characteristics of processors. Analysis of their capacity 
is most appropriately handled by considering them as channels in the 
information theoretic sense. 










SOFTWARE UNITS 


In describing the properties of an executing computing system or
any of its subsystems it is necessary to specify not only the hardware
constituent (device or subconfiguration) but also a "software"
constituent. Clearly, this is so because a processor event consists of
the execution of an instruction by a processor and the kind of
properties we are interested in may be conditioned by the instruction
and, therefore, sets of events by sets of instructions.

The usual methods of designating or categorizing software are not
adequate for our purposes. Most of them imply identities or
equivalencies which simply will not hold in the situations we are
describing. For instance, if the software constituent were defined to
be a program in its source language state then a program compiled and
executed on two different computers would be considered the same, in
some regards at least. Similarly, a program run through two different
compilers and executed on one machine or two different executions of
the object code on one machine would be considered identical when, in
fact, they might (rather, probably would) produce different series of
processor events. Furthermore, such approaches limit scope, preventing
us from talking about execution characteristics of sets of
instructions across program boundaries.

To arrive at a more satisfactory definition we resort to the
following heuristic device. Assume the existence (or the possibility
of existence) of a complete collection of instantaneous descriptions
of all instruction executions over some time interval, in the manner
of an instruction execution trace. Each description contains a
variety of information such as all relevant addresses of the
instruction (absolute memory address, offsets from base registers, from
page start, page number, etc.), the instruction itself, all operand
addresses, the operands themselves (data), the time at the start and
end of the execution and whatever else is required. More precisely,
we should say assume that the information provided by such a trace is
known. Now we are in a position to define a software unit.







A software unit is any subset taken from the complete collection 
of instruction-operand executions over some time interval. 

Note that this means the processor events, the execution of 
instructions operating on operands and those operands, not some 
selected lines from a print-out. 

The software unit (subset) consisting of the full set for some
time interval is called the full workload for that interval and is
denoted by L. If the time interval is partitioned into several
intervals, then the full workload for each subinterval is called a
subworkload, designated as $L_i$, and $L = \bigcup_i L_i$.

Obviously, any software unit may be decomposed into other software
units and the union or intersection of any two software units is also
a software unit. Some decompositions or compositions are more usually
employed than others. Most often software units are partitioned:

$S = \bigcup_i S_i$ where $S_i \cap S_j = \emptyset$ for $i \neq j$.

Some common partitionings are into software units where each unit
is all the executions of a given instruction or class of instructions,
into applications, jobs, and program runs (discussed below), into
subworkloads as described above or into other partitions within
subworkloads and so on. Situations arise, especially in measurement
experiments and in accounting procedures, where a decomposition is not
known with certainty to be a partition. It is necessary to be aware
of this fact so that false conclusions about properties of the
aggregation may be avoided.
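
As a minimal sketch of these definitions, a software unit can be modelled as a set of execution records and a decomposition can be checked for being a partition. The record fields, the tiny trace and the helper `is_partition` are assumptions made for illustration; the paper only requires that the information of an instruction execution trace be known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Execution:
    """One instruction-operand execution from a hypothetical trace."""
    seq: int          # position in the trace
    processor: str    # device that executed the instruction
    opcode: str
    start: float
    end: float


trace = [
    Execution(0, "cpu", "LOAD", 0.0, 1.0),
    Execution(1, "cpu", "ADD", 1.0, 2.0),
    Execution(2, "disk", "READ", 1.5, 9.0),
    Execution(3, "cpu", "ADD", 2.0, 3.0),
]

# The full workload L is the whole collection for the interval observed.
L = frozenset(trace)


def is_partition(parts, whole):
    """True if the parts are pairwise disjoint and their union is `whole`."""
    union = frozenset().union(*parts)
    disjoint = sum(len(p) for p in parts) == len(union)
    return disjoint and union == whole


# Partition by instruction: one software unit per opcode.
by_opcode = [frozenset(e for e in L if e.opcode == op)
             for op in {e.opcode for e in L}]

# A decomposition that covers L but overlaps, so it is not a partition.
cpu_unit = frozenset(e for e in L if e.processor == "cpu")
early_unit = frozenset(e for e in L if e.start < 2.0)

# Unions and intersections of software units are again software units.
both = cpu_unit & early_unit
```

The overlapping pair `cpu_unit`, `early_unit` is the kind of decomposition the text warns about: summing a property over the two units would count the executions in `both` twice.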

In a deterministic machine a software unit is completely
determined by a set of instructions (program) and the full set of
initial operand values (inputs) to them. Therefore, it is possible to
treat the pair instruction-set/inputs as equivalent to a software
unit. In practice, the only situations where both may be precisely
specified are applications, jobs, steps, program runs and the
like. On occasion, following common parlance, expressions such as
"sort work" and "work of the operating system" are used. The first of
these means the work of the software unit made up of executions of the
sort program against some known set of inputs (last month's invoices,
etc.) and similarly for the second. What is not meant, most
emphatically, is that the sort program or the operating system program
is a software unit.













NOTATION 


In analyzing the behavior of executing computing systems we need
to characterize sets of processor events and their consequences. A
processor event always requires a "linked pair", an element of a
software unit (executed instruction and operands) and a processor,
including subconfigurations. To specify sets of events we must
specify software units on sets of processors. Notationally, this is
handled by using functional notation as it is commonly used in
probability theory and combinatorial analysis. Software physics
variables will generally be of the form:

F(software unit list; processor list)

where F is the functor, the two lists are separated by a semicolon and
the elements of each list are separated by commas. Note that F is not
a function of the software unit and subconfiguration. It is a variable
or a value over the range specified in the two lists. The lists
specify to what the variable or value pertains.

Certain classes of software units and certain logical
subconfigurations are so frequently used that a standard set of
symbols for them has been established for the sake of brevity and
convenience. As described before, L is the software unit (subset)
made up of all the instruction-operand executions for some period of
observation. If L is partitioned by time, then the subworkloads are
designated by $L_i$, $i = 1, 2, 3, \ldots$. S is the symbol for any
software unit whatsoever (including L) and $S_i$, $i = 1, 2, 3, \ldots$,
is any subset of S.

Usually we will be interested in partitions of S, that is, where
$S_i \cap S_j = \emptyset$ for $i \neq j$. If we need to distinguish
hierarchies of subsets (partitions), the subscripting will be handled as
it is for subconfigurations, described below.


The most common subconfigurations referred to are the whole
configuration, $\psi$, the I/O subconfiguration, $\theta$, the CPU
subconfiguration, $\gamma$, channel subconfigurations, $\alpha_i$,
control unit subconfigurations, $\beta_{ij}$, and drives, $\delta_{ijk}$.
For those subconfigurations which are hardware subconfigurations, that
is, where the root node is an actual device, it is desirable to have a
distinct, but related, symbol for the device. The most usual are $a_i$
for channels and $b_{ij}$ for control units. The drives, being leaves of
the tree, are degenerate cases of subconfigurations where the drives and
the subconfigurations are one and the same thing, so no distinct symbol
is required for the device.

In many cases it is necessary to indicate several software units
or several subconfigurations (or both). Furthermore, it is always the
case that an item in the list is contained by another item in the list
(except the last, of course). Our convention will be to write the
lower level or smaller item to the left. So now the format of a
typical variable is:

F(software unit, containing S.U., . . .; subconfiguration,
containing subconfiguration, . . .)

For example, we may be interested in the following utilization:
$U(S_i, S; \gamma, \psi)$. This means the ratio of the time that software
unit $S_i$ is executing on the CPU with respect to the time the containing
software unit S is executing on the full configuration.

The symbol for any subconfiguration whatsoever is $\chi$ (chi). In
writing many expressions it is necessary to indicate not only that one
subconfiguration contains another but also how many levels away the
other is. This is handled by the device of using an asterisk in the
subscripts. An asterisk means a string of one or more subscript values.
The usual rules for substitution apply. Once a string of subscripts
has been assigned to an asterisk then it must be substituted uniformly
throughout the expression. Thus, if we write an expression about $\chi$
and $\chi_*$ then we are talking about any subconfiguration and any
subconfiguration contained in it. If we write about $\chi$ and
$\chi_{**}$ then we are talking about any subconfiguration at least two
levels below $\chi$. If we write $\sum_j F(S;\chi_{*j},\chi)$ then we are
talking about the sum of the F's for all subconfigurations at some level
which is at least two below $\chi$, and so on. If it is required to deal
with hierarchies of software units (especially partitions) then the same
mechanism may be employed.










FUNDAMENTAL VARIABLES: TIME 


Extent in time is the most generally perceived property of
processor events. So much so, in fact, that it is commonly and
erroneously used as a measure of the amount of activity of computing
systems. On top of this there is considerable confusion over
appropriate measures of time, especially for subconfigurations, and the
relationship between device times and subconfiguration times. Clarity
and precision in concept and definition of time are absolutely
essential if the theory is to be fruitful and not break down at some
later point in the deduction process.

The most basic measure of time is the time it takes a
device to execute an instruction. This includes all of the time
required by that device alone to perform the instruction, i.e., it is
not just transfer time. It does not include preparation time,
hold-off time or blocking time due to some other (perhaps internal)
device even though the subject device is "connected" or signalling
"busy." To illustrate, the execution time for a disk instruction
includes the seek time, search time (rotational delay, latency) and
the transfer time. It does not include the time after the seek is
completed during which the search cannot start (if any), commonly
called seek delay, or the time consequent on RPS "misses", despite the
fact that the drive is "busy" throughout all those times. Seek delay
and RPS miss time are called delay times because they are due to some
other drive blocking at the control unit or channel level. They are
not due to the execution of the observed drive.

The execution time of device Y for software unit S is the total of
the instruction execution times on Y of the instructions of S for Y.
The functor for execution time is the symbol Tx and the above statement
is written Tx(S;Y). On occasion, we shall need to discuss delay
times and busy times. These are symbolized as Td and Tb respectively
and the three are related by

Tb(S;Y) = Tx(S;Y) + Td(S;Y).
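
A small worked sketch of this relation for a disk drive follows. The class `DiskRequest`, its field names and all of the timings are hypothetical; only the split into seek, search and transfer (execution) versus seek delay and RPS miss (delay) is taken from the text.

```python
from dataclasses import dataclass


@dataclass
class DiskRequest:
    """Timing components of one disk instruction, in arbitrary time units."""
    seek: float
    search: float            # rotational delay (latency)
    transfer: float
    seek_delay: float = 0.0  # blocked after the seek by another drive
    rps_miss: float = 0.0    # extra revolutions after an RPS miss

    @property
    def tx(self):
        # Execution time: work done by this drive alone.
        return self.seek + self.search + self.transfer

    @property
    def td(self):
        # Delay time: the drive is "busy" but waiting on some other device.
        return self.seek_delay + self.rps_miss

    @property
    def tb(self):
        # Busy time is execution time plus delay time.
        return self.tx + self.td


# Hypothetical requests of one software unit S on one drive Y.
requests = [
    DiskRequest(seek=20.0, search=8.0, transfer=2.0, seek_delay=5.0),
    DiskRequest(seek=0.0, search=4.0, transfer=2.0, rps_miss=16.0),
]

Tx = sum(r.tx for r in requests)
Td = sum(r.td for r in requests)
Tb = sum(r.tb for r in requests)
```

With these invented numbers the drive reports 57 units busy, of which only 36 are its own execution time.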

We extend the definition of execution time to subconfigurations by
appealing to a concept common in modern physics: time changes in a
system only when an event occurs in that system. Translated into
software physics terms the definition states that execution time for a
subconfiguration increases when any constituent of the subconfiguration
is in execution. An equivalent, and perhaps more intuitively
appealing, definition may be made by using the notion of associated
clocks. Each device has a clock and that clock runs whenever that
device is executing. Associated with each node of every logical
subconfiguration is a clock that runs when the clock of the device at
that node (if there is one) is running or when the clock of any
contained node is running. This approach has the same recursive
character as the first one.

Because of the possibility of simultaneous activity at lower
levels it is evident that the execution time of a subconfiguration is
not, in general, the simple sum of the execution times of lower level
subconfigurations. In fact,

$Tx(S;\chi) \le \sum_j Tx(S;\chi_j)$,

which also may be applied recursively. Likewise, dually,

$Tx(S;\chi) \le \sum_i Tx(S_i;\chi)$,

even when the $S_i$ constitute a partition. To find a rule of
composition for execution times, we must go back to the basics. Indicate
a point in time when $\chi_j$ is executing by $tx(S;\chi_j)$. An
instruction execution, somewhere in $\chi_j$, generates a continuous
point set, mappable to the real line, over some interval $I$. Indicate
that point set by

$\{tx(S;\chi_j)\}_I$.

The amount of time to perform the instruction is a measure function on
that point set (e.g., upper bound minus lower bound) indicated by

$\left|\{tx(S;\chi_j)\}_I\right|$.

This in turn leads to

$Tx(S;\chi_j) = \sum_I \left|\{tx(S;\chi_j)\}_I\right|$,

which is to say that the execution time of $\chi_j$ for S is the sum of
the time interval lengths for the execution of instructions from S for
$\chi_j$, as we said above.

Now, if we have several subconfigurations, $\chi_j$ (for various j),
then the union of their point sets will create a new set of intervals,
K, and we have

$Tx(S;\chi) = \sum_K \left|\left\{\bigcup_j \{tx(S;\chi_j)\}\right\}_K\right|$.

This is the rule of composition we were seeking.
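
The rule of composition amounts to merging execution intervals and summing the lengths of what remains. The sketch below assumes intervals are given as (start, end) pairs; the function names and the three drive patterns are hypothetical and are not the data of figure 6.

```python
def merge(intervals):
    """Union of intervals as a sorted list of disjoint (start, end) pairs."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def measure(intervals):
    """Total length of a list of disjoint intervals."""
    return sum(hi - lo for lo, hi in intervals)


def tx_device(intervals):
    """Execution time of one device: sum of its interval lengths."""
    return measure(merge(intervals))


def tx_subconfiguration(children):
    """Execution time of a subconfiguration from its constituents' intervals."""
    return measure(merge([iv for child in children for iv in child]))


# Hypothetical execution patterns for three drives under one control unit.
d1 = [(0, 10), (30, 40)]
d2 = [(5, 20)]
d3 = [(35, 50)]

sum_of_parts = tx_device(d1) + tx_device(d2) + tx_device(d3)
whole = tx_subconfiguration([d1, d2, d3])
```

Because `d1` overlaps both `d2` and `d3` in time, the control unit's clock accumulates 40 units while the drives' clocks together accumulate 50, which is the inequality stated above.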

That forbidding looking nest of brackets is really asserting
something that is in the realm of common sense. Some examples from
figure 6 will make the point clear. Figure 6 illustrates an execution
time pattern for three mutually exclusive software units, called $S_1$,
$S_2$ and $S_3$, on a configuration consisting of one CPU and one channel
with two control units, each with three disk drives. The whole time
span has been divided into 100 units to permit times to be equated to
ratios (percents) readily. The first band of three lines shows the
execution time pattern of the three software units on the CPU. The
next band of three lines shows the execution times of the channel and
control unit devices. The next six lines show the time patterns for
the drives, the labels above each segment showing the software unit
associated with that execution. Several bands further down are the
execution patterns for the two control unit subconfigurations and the
channel subconfiguration, $Tx(L;\beta_{11})$, $Tx(L;\beta_{12})$ and
$Tx(L;\alpha_1)$.

The execution pattern for $\delta_{111}$ is a broken set of intervals
whose lengths total 55 units. Likewise, the sums of the interval
lengths for $\delta_{112}$ and $\delta_{113}$ are 30 and 32.5
respectively. Now, form the "union" of those execution patterns (imagine
the "ceiling" is lighted and look for the shadows on the "floor") and we
get the execution pattern of $\beta_{11}$, which is just three execution
intervals, totalling 85 units. Do the same for the other three drives
and $\beta_{12}$, producing one long execution interval of 80 units. Do
the same with either all 6 drives or the two $\beta$ execution patterns
and produce the pattern for $\alpha_1$. This, in a nutshell, was what the
previous discussion was describing.



FIGURE 6










One other observation should be made at this point. If the times
shown for the drives were busy times rather than execution times, that
is, included seek delay and RPS miss time as well as execution time,
it would have no effect on the time pattern for the $\beta$ or $\alpha$
subconfigurations. In other words,

$Tb(L;\beta_{11}) = Tx(L;\beta_{11})$ and $Tb(L;\alpha_1) = Tx(L;\alpha_1)$.

This is so because the delays experienced by one drive are due to the
execution of some other drive.

There is another class of time measure commonly encountered which
should be accounted for in our system: elapsed time and the closely
related idle time. It is clear from the definitions that idle time for
a subconfiguration will not be registered on its own clock since that
clock is only running when the subconfiguration is executing, i.e., not
idle. Since elapsed time would be identical to execution time unless
provision were made for intervening intervals of idle time, the same
argument holds for elapsed time as well. Therefore, both must be
measured with respect to some higher clock. Elapsed time for a
subconfiguration $\chi_*$ is the difference, on the clock of $\chi$,
between the beginning of the earliest execution (least lower bound) and
the end of the latest execution (greatest upper bound) of $\chi_*$.
Elapsed time for subconfigurations is written $Te(S;\chi_*,\chi)$. There
is an analogous definition for the software unit dual, written
$Te(S_i,S;\chi)$, and the composite case is $Te(S_i,S;\chi_*,\chi)$.
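
Elapsed time on a higher clock can be sketched as the amount of the containing node's execution pattern that falls between the first start and the last end of the contained node. The helper names are hypothetical, and the interval patterns are invented so as to mirror the kind of example discussed for figure 6 (a drive active from 30 to 100 whose control unit clock stops for five units in between); they are not read from the figure.

```python
def merge(intervals):
    """Union of intervals as a sorted list of disjoint (start, end) pairs."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def elapsed(child, parent):
    """Elapsed time of `child`, measured on the clock of the containing `parent`.

    The parent's clock only advances while the parent is executing, so only
    the parent's execution intervals between the child's earliest start and
    latest end are counted.
    """
    first = min(lo for lo, _ in child)
    last = max(hi for _, hi in child)
    total = 0.0
    for lo, hi in merge(parent):
        overlap = min(hi, last) - max(lo, first)
        if overlap > 0:
            total += overlap
    return total


# Hypothetical patterns: the control unit clock is stopped during
# (67.5, 70) and (77.5, 80); the channel clock never stops.
drive = [(30.0, 40.0), (90.0, 100.0)]
control_unit = [(0.0, 67.5), (70.0, 77.5), (80.0, 100.0)]
channel = [(0.0, 100.0)]

te_on_control_unit = elapsed(drive, control_unit)
te_on_channel = elapsed(drive, channel)
```

The same drive therefore has an elapsed time of 65 on the control unit's clock and 70 on the channel's clock, although its own execution time is only 20.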


One last definition, to specify the upper bound to the process:

$Te(L;\psi,\psi) = Tx(L;\psi)$.

This assertion not only provides closure but it also states that there
is no higher clock than that of $(L;\psi)$. In other words, idle times
for the entire system are excluded from consideration. Periods when the
whole system is idle because of lack of jobs, schedule problems, power
outages, machine or software breakdowns or because the installation is
closed for week-ends and holidays may be of interest or concern to
management but they have nothing to do with a theory of executing
computing systems.

The illustration of figure 6 should clarify some of the
relationships described above. For example, Te(L;5j^,B^) * 65
because the beginning is at 30 and end at 100 but there are five units 
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wherein the clock of 8^ is not running (ending at 70 and 80). 
However, Te (L;*^^ ,01^) » 70. Also, Te(S^,L;<S^ 21 > * 65 but 
TefS^L.-Sj^) =■ 72.5. 

It is the usual, if not universal, practice to use only the
highest clock, that of $(L;\psi)$, and to talk of elapsed time only with
respect to software units. Why this should be, other than the force
of habitual thinking, is not clear.


LOGICAL SUBCONFIGURATIONS AND SOFTWARE UNITS: DUALITY



[Figure: example subconfiguration trees based on software units; not recoverable from the source.]


If $S$ contains $S_i$ then a subconfiguration $\chi/S$ contains $\chi/S_i$,
just as subconfigurations based on devices do, e.g., if $\chi$ contains
$\chi^*$ then $\chi/S$ contains $\chi^*/S$. Therefore, any theorems about
properties of executing systems arising from containment of device-based
subconfigurations will be true, suitably rewritten, for subconfigurations
based on software units and subunits. This will be, in fact,
most theorems.


It may have been noticed that subconfigurations based on software 
subunits have the same height as the containing subconfiguration. 

This will always be the case. Subconfigurations contained in device 
subconfigurations have a lower height. The first variety are formed 
by pruning leaves and branches, the second by removing roots. If 
these actions are considered as operations, then we can talk of the 
composition of the operations. Clearly, the order of performing the 
operations is immaterial, yielding the same resultant subconfiguration. 






TIME RELATIONS: UTILIZATIONS AND CONCURRENCIES 


Given that we have clocks at every node of every logical
subconfiguration, it is natural to relate the readings of one clock to
another. Two of the most common classes of relationship are the time
on a clock with respect to that of a containing node and time on a
clock with respect to that of a similar node in another subconfiguration.
The first relationship yields utilization numbers and the
second time-balance ratios, which can be derived from utilizations.


The definition of utilization for subconfigurations is:

\[ U(S;\chi^*,\chi) = \frac{\mathrm{Tx}(S;\chi^*)}{\mathrm{Tx}(S;\chi)} \]

This is the utilization of $\chi^*$ with respect to $\chi$, the portion of the
time that $\chi$ is executing when $\chi^*$ is also executing. On those
occasions where we need to talk about utilizations based on busy time,
Tb, or delay time, Td, then we will write Ub and Ud respectively. U
is understood to mean Ux. Some examples from figure 6:

\[ U(L;\delta_{110},\beta_{11}) = .65 \]

\[ U(S_2;\delta_{112},\beta_{11}) = \frac{32.5}{35} = .93 \]

\[ U(L;\delta_{121},\alpha_1) = \frac{72.5}{90} = .81 \]

and so on.


The dual for software units is:

\[ U(S_i,S;\chi) = \frac{\mathrm{Tx}(S_i;\chi)}{\mathrm{Tx}(S;\chi)} \]

This measure is commonplace in practice where $\chi$ is the CPU and $S$ the
full workload, i.e.,

\[ U(S_i,L;\gamma) = \frac{\mathrm{Tx}(S_i;\gamma)}{\mathrm{Tx}(L;\gamma)} \]

Examples from figure 6:

\[ U(S_3,L;\beta_{11}) = .35, \qquad U(S_1,L;\psi) = 1.00, \text{ etc.} \]

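The utilization resulting from the composition of operations is:

\[ U(S_i,S;\chi^*,\chi) = \frac{\mathrm{Tx}(S_i;\chi^*)}{\mathrm{Tx}(S;\chi)}, \]

that portion of the time when $S$ is executing on $\chi$ that $S_i$ is
executing on $\chi^*$. Now, if we multiply the right side by

\[ \frac{\mathrm{Tx}(S_i;\chi)}{\mathrm{Tx}(S_i;\chi)} \]

and rearrange factors, we have:

\[ U(S_i,S;\chi^*,\chi) = \frac{\mathrm{Tx}(S_i;\chi^*)}{\mathrm{Tx}(S_i;\chi)} \cdot \frac{\mathrm{Tx}(S_i;\chi)}{\mathrm{Tx}(S;\chi)} = U(S_i;\chi^*,\chi) \cdot U(S_i,S;\chi). \]

Or, we can multiply by

\[ \frac{\mathrm{Tx}(S;\chi^*)}{\mathrm{Tx}(S;\chi^*)} \]

and have:

\[ U(S_i,S;\chi^*,\chi) = \frac{\mathrm{Tx}(S_i;\chi^*)}{\mathrm{Tx}(S;\chi^*)} \cdot \frac{\mathrm{Tx}(S;\chi^*)}{\mathrm{Tx}(S;\chi)} = U(S_i,S;\chi^*) \cdot U(S;\chi^*,\chi). \]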

Rearranging the factors in one of these and comparing the two
emphasizes the dual symmetry. Again, examples from figure 6:

\[ U(S_3,L;\beta_{12},\alpha_1) = .42, \qquad U(S_3;\beta_{12},\alpha_1) = .94, \qquad U(S_3,L;\alpha_1) = .44, \]

and $.42 \approx .94 \times .44$ (to within rounding), etc.


Because of the ratio definition there is an evident "chain rule":

\[ U(S;\chi^{**},\chi) = U(S;\chi^{**},\chi^*) \cdot U(S;\chi^*,\chi), \]

and dually for software units. The reader can find examples in the
illustration. These obvious identities have considerable practical
value, permitting us to measure some of the values and derive the
others. Usually, certain measures are more difficult to obtain than
others, so that with these identities we may avoid the need for taking
the more difficult measurements.
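
As a rough numerical check of the chain rule, the following sketch computes a utilization directly and through an intermediate node. The node names and the intermediate value of 80 are assumptions made for the example; only 72.5 and 90 echo values quoted in the text.

```python
import math

# Hypothetical execution times Tx(L; node) for three nested nodes.
tx = {"delta_121": 72.5, "beta_12": 80.0, "alpha_1": 90.0}


def utilization(tx, lower, higher):
    """U(S; lower, higher) = Tx(S; lower) / Tx(S; higher)."""
    return tx[lower] / tx[higher]


# Measured directly against the highest of the three clocks.
direct = utilization(tx, "delta_121", "alpha_1")

# Derived through the intermediate node, as the chain rule permits.
chained = (utilization(tx, "delta_121", "beta_12")
           * utilization(tx, "beta_12", "alpha_1"))
```

In practice this means that if one of the three ratios is hard to measure, it can be derived from the other two.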


Numbers of considerable interest arise when we take sums of
utilizations such as

\[ \sum_{*} U(S;\chi^*,\chi), \]

where the $\sum_{*}$ means the sum over the range of the rightmost
subscript in it. This is the average number of $\chi^*$'s executing over
the execution time of $\chi$ or, in short, an average concurrency. This
becomes apparent by simply expanding the definition:

\[ M(S;\chi^*,\chi) = \sum_{*} U(S;\chi^*,\chi) = \sum_{*} \frac{\mathrm{Tx}(S;\chi^*)}{\mathrm{Tx}(S;\chi)} = \frac{\sum_{*} \mathrm{Tx}(S;\chi^*)}{\mathrm{Tx}(S;\chi)}, \]

i.e., the product of $M(S;\chi^*,\chi)$ and $\mathrm{Tx}(S;\chi)$ yields the sum of the
execution times of the $\chi^*$'s, so that $M$ is the average multiplicity of
$\chi^*$ executions over the time $\mathrm{Tx}(S;\chi)$.


$M(S;\chi^*,\chi)$ is a measure of concurrency of execution of processor
subconfigurations which is commonly called the level or degree of
multiprocessing. The software dual, $M(S_i,S;\chi)$, is the level of
multiprogramming. Consequently, Kolence (KOLE76) refers to them
collectively as MP levels. The composite case is the combined
multiprogramming-multiprocessing level:

\[ M(S_i,S;\chi^*,\chi) = \frac{\sum_{i,*} \mathrm{Tx}(S_i;\chi^*)}{\mathrm{Tx}(S;\chi)}, \]

where, of course, the $S_i$ are mutually exclusive, as they are for the
simple multiprogramming case.
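
The MP level can be computed from raw execution intervals, as in the sketch below. It rests on one simplifying assumption that the text does not state in this form: the containing clock is taken to run exactly when at least one contained unit is executing. Function names and interval data are hypothetical.

```python
import math


def union_length(intervals):
    """Total length of the union of possibly overlapping intervals."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged run and open a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def mp_level(executions):
    """M = (sum of contained execution times) / (container execution time)."""
    all_intervals = [iv for ivs in executions.values() for iv in ivs]
    container_time = union_length(all_intervals)
    return sum(end - start for start, end in all_intervals) / container_time


# Two hypothetical contained units that overlap between t=4 and t=6.
executions = {"chi_1": [(0.0, 6.0)], "chi_2": [(4.0, 10.0)]}
level = mp_level(executions)
```

Here twelve units of execution are packed into ten units of container time, giving an average concurrency of 1.2.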


Often, and often inadvertently or unwittingly, utilizations and MP
levels are taken with respect to busy times:

\[ \mathrm{Ub}(S;\chi^*,\chi) = \frac{\mathrm{Tb}(S;\chi^*)}{\mathrm{Tx}(S;\chi)}. \]

Since $\mathrm{Tb}(S;\chi^*) = \mathrm{Tx}(S;\chi^*) + \mathrm{Td}(S;\chi^*)$,

\[ \mathrm{Mb}(S;\chi^*,\chi) = \frac{\sum_{*} \mathrm{Tx}(S;\chi^*)}{\mathrm{Tx}(S;\chi)} + \frac{\sum_{*} \mathrm{Td}(S;\chi^*)}{\mathrm{Tx}(S;\chi)} = M(S;\chi^*,\chi) + \mathrm{Md}(S;\chi^*,\chi), \]

which is hardly surprising (the average of sums is the sum of
averages). However, under interpretation, this does provide an
interesting "cross-over" to results in the operational analysis of
queueing networks.
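
The decomposition of the busy-time MP level can be checked with a small sketch. The function name and the sample times are assumptions chosen for the example.

```python
def mp_levels(tx_lower, td_lower, tx_container):
    """Return (M, Md, Mb) for contained units with execution and delay times.

    tx_lower[k] and td_lower[k] are the execution and delay times of the
    k-th contained subconfiguration; busy time is their sum.
    """
    m = sum(tx_lower) / tx_container
    md = sum(td_lower) / tx_container
    mb = sum(x + d for x, d in zip(tx_lower, td_lower)) / tx_container
    return m, md, mb


# Hypothetical drives: execution times 30 and 45, delay times 6 and 9,
# over a container execution time of 60.
m, md, mb = mp_levels([30.0, 45.0], [6.0, 9.0], 60.0)
```

Measuring only `mb` would hide how much of the apparent concurrency is overlapped waiting (`md`) rather than overlapped execution (`m`).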


$M(S;\chi^*,\chi)$ is the average number of requests being simultaneously
serviced within $\chi$ (by the $\chi^*$'s) and $\mathrm{Md}(S;\chi^*,\chi)$ the average number
of requests waiting for service within $\chi$ (in RPS or seek-delay
states). Note that this does not include those requests enqueued
before $\chi$. Therefore, $\mathrm{Mb}(S;\chi^*,\chi)$, consistently interpreted, is $n_\chi$,
the number of requests in queue within $\chi$. (Here, and below, we use
the notation of Buzen and Denning for variables from the operational
analysis of queueing networks:


    $n_\chi$ is the number of requests enqueued on and
             being serviced by $\chi$,
    $X_\chi$ is the throughput rate of $\chi$,
    $S_\chi$ is the average service time of $\chi$.)

In the present context the use of the two notations is unfortunately
confusing. Nonetheless, it seems to be the best way to bring out the
comparability of the two approaches. If now we introduce $C$, the
number of completions in time $\mathrm{Tx}(S;\chi)$, then the average busy time for
the $\chi^*$'s is:

\[ \overline{\mathrm{Tb}}(S;\chi^*) = \frac{\sum_{*} \mathrm{Tb}(S;\chi^*)}{C}. \]

Now, since

\[ \mathrm{Mb}(S;\chi^*,\chi) = \frac{\sum_{*} \mathrm{Tb}(S;\chi^*)}{\mathrm{Tx}(S;\chi)}, \]

it follows that

\[ \overline{\mathrm{Tb}}(S;\chi^*) \cdot C = \mathrm{Mb}(S;\chi^*,\chi) \cdot \mathrm{Tx}(S;\chi) \quad\text{or}\quad \overline{\mathrm{Tb}}(S;\chi^*) = \mathrm{Mb}(S;\chi^*,\chi) \cdot \frac{\mathrm{Tx}(S;\chi)}{C}. \]

The last factor is the average service time $S_\chi$. This implies that
$\overline{\mathrm{Tb}}(S;\chi^*)$ is the internal response time $R_\chi$ (not counting
queue time before $\chi$), because if $R_\chi = n_\chi \cdot S_\chi$, as the above implies,
and if Little's Law holds, $n_\chi = R_\chi \cdot X_\chi$, where $X_\chi$ is the throughput
rate of $\chi$, then

\[ R_\chi = R_\chi \cdot X_\chi \cdot S_\chi \quad\text{or}\quad X_\chi = \frac{1}{S_\chi}, \]

a fundamental result (or definition) of queueing network analysis.
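
The correspondence with operational analysis can be traced numerically. The sketch below assumes hypothetical busy times for two contained subconfigurations and a hypothetical completion count; the variable names follow the Buzen-Denning symbols only loosely.

```python
import math

tx_container = 60.0          # Tx(S; chi): observation time on chi's own clock
busy_by_sub = [36.0, 54.0]   # hypothetical Tb(S; chi*) for each chi*
completions = 4              # C: hypothetical number of completed requests

queue_length = sum(busy_by_sub) / tx_container    # n = Mb(S; chi*, chi)
service_time = tx_container / completions         # S = Tx(S; chi) / C
throughput = completions / tx_container           # X = C / Tx(S; chi)
response_time = sum(busy_by_sub) / completions    # R = average busy time

# R = n * S, and Little's Law n = R * X, both follow from the definitions.
r_from_queue = queue_length * service_time
n_from_little = response_time * throughput
```

Because the observation period is the execution time of the containing subconfiguration itself, the product of throughput and service time comes out as one.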


The discussion and demonstration given above has significance not 
just because it shows the isomorphism of the two approaches, under 
proper interpretation, but because it suggests extensions in the 
results of queueing network analysis by virtue of the software unit 
dualities that arise from the software physics approach. This whole 
matter, the subsuming of queueing network analysis by the methods of 
software physics, is analyzed and developed by Traister (TRAI79).

MP levels based on execution time, which are a consequence of 
opportunities for parallelism inherent in the configuration and the 
patterns of request arrivals, are a measure of how well the software 
units exploit the configuration or, conversely, how well the configuration
meets the requirements of the software units. MP levels based
on busy time show the concurrency of waiting as well as executing. In 
fact, beyond a certain point, as Mb increases, Md increases more 
rapidly than M, that is, there is more "improvement" in the overlap of 
waiting than there is in processing. To measure only Mb without 
obtaining either Md or M is to know only a part of the performance 
characteristics. 

Another set of useful identities arises when we apply the chain
rule for utilizations to MP levels:

\[ M(S;\chi^{**},\chi) = U(S;\chi^*,\chi) \cdot M(S;\chi^{**},\chi^*), \]

and so on.

Ratios of device times to subconfiguration times arise in the 
analysis of delays due to blocking, especially at control units and 
channels where connect time is less than drive execution time, as is 
the case with disk drives. For example, since U(L;b,B) is the probability
that there is a path block at the control unit when some drive
is executing it figures directly in the creation of seek delays and 
RPS misses. 

Ratios of utilizations, such as

\[ \frac{U(S;\chi^*_i,\chi)}{U(S;\chi^*_j,\chi)} = \frac{\mathrm{Tx}(S;\chi^*_i)}{\mathrm{Tx}(S;\chi^*_j)}, \]

are time balance numbers. They are frequently of interest to
configuration designers and system tuners. They are not the only variety
of balance indicators, however.




FUNDAMENTAL VARIABLES: WORK 


A second fundamental variable is required for the theory to
quantify the activity of software units and processors, a measure of
how much was effected by them. Within appropriate constraints, this
measure must be invariant with respect to time. For instance, if we
executed a program against a set of inputs (which determines a
software unit) on some configuration, then made a modification which
doubled the basic cycle rate of some of the processors, and then
executed the same program against the same inputs again, the measures
of activity must be equal. Similarly, if we execute the same
program-input pair on two machines from the same "family", having
identical instruction sets, then the measures must be equal even
though there may be considerable difference in the internal
operations of the two.

This measure is software work. It has the same basis as work in 
any branch of physics. Work is associated with changes in state. 

Work is performed whenever there is a change of state: the greater 
the state change the greater the amount of work and, in discrete state 
spaces, the minimum amount of work is associated with a minimum state 
change. 

Kolence (KOLE76) develops the definition in the following way.
Storages are made up of sets of n-bit containers. The n-bit string in
a container at any instant in time is a symbol from an alphabet of
$2^n$ symbols. At every instant in time every container is in a symbol
state. A unit of work is performed when the symbol state of one
container is changed. A "standard" container size of 8 bits is
arbitrarily chosen and, following common usage, both the container and its
symbol are called a byte. The working definition then becomes: A
unit of work is performed when a processor changes one byte of
storage. Since, in practice, our instrumentation will not permit us
to see the "before" and "after" values of every processor action on
every byte of storage, the "laboratory" definition becomes: One unit
of work is performed when a processor transfers one byte to a
storage. Clearly, on average, this unit is less than one of the units
in the basic definition because sometimes the byte transferred will be
the same as the byte that was in the storage.
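
The difference between the basic and the "laboratory" definitions can be made concrete with a small sketch. The function names and sample data are hypothetical; the sketch simply counts changed bytes versus transferred bytes for one write.

```python
def work_changed(before: bytes, after: bytes) -> int:
    """Basic definition: one unit of work per byte whose value changes."""
    if len(before) != len(after):
        raise ValueError("before and after must cover the same containers")
    return sum(1 for old, new in zip(before, after) if old != new)


def work_transferred(data: bytes) -> int:
    """Laboratory definition: one unit of work per byte transferred."""
    return len(data)


# A hypothetical four-byte write in which two bytes already held the value.
storage_before = b"ABCD"
written = b"ABXY"

basic_work = work_changed(storage_before, written)
laboratory_work = work_transferred(written)
```

The laboratory count is never smaller than the basic count, which is why each laboratory unit represents, on average, somewhat less than one basic unit.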

We have now defined work and a measure for work by processors. It
remains to define them for software units and logical
subconfigurations. The work done by an instruction-operand pair is the work done
by the processor in executing the instruction on the operands. The
amount of work done by a software unit is the sum of the amounts of
work done by the instruction-operands that make up the software unit.
Work is done by a subconfiguration if work is done at either end of a
path going through the subconfiguration. In other words, a $\beta$
subconfiguration, for example, performs work when there is a write to one of
its drives or when there is a read from one of its drives (which
changes main storage, which is outside of the $\beta$ subconfiguration). At
one blow, this definition does three things: 1) it defines the work
of a subconfiguration as all of the work that "emanates" from it, in
either direction, 2) it preserves the containment property for work
and 3) it partitions subconfigurations. A partition of a logical
subconfiguration is a set of subconfigurations such that their graft
is the subconfiguration and such that each "leaf" of the
subconfiguration appears in just one member of the set. (A physical leaf
having two designations may appear in two members, once under each
designation, because the actions of the two logical units are mutually
exclusive over time.)

We write the work done by software unit $S$ on subconfiguration $\chi$ as
$W(S;\chi)$. A unit of work, called, simply enough, a work, is indicated by
w. Metric prefixes are normally used to indicate larger quantities of
work, e.g.:


    1000 w = 1 Kilowork = 1 Kw; 1000 Kw = 1 Megawork = 1 Mw;
    and 1000 Mw = 1 Gigawork = 1 Gw.

It follows directly from the definitions given above that

\[ W(S;\chi) = \sum_{i} W(S_i;\chi), \]

where the $S_i$ are a partition, and

\[ W(S;\chi) = \sum_{*} W(S;\chi^*), \]

where the $\chi^*$ are a partition of $\chi$. In particular,

\[ \begin{aligned}
W(S;\psi) &= W(S;\gamma) + W(S;\theta), \\
W(S;\theta) &= \sum_{i} W(S;\alpha_i), \\
W(S;\alpha_i) &= \sum_{j} W(S;\beta_{ij}), \text{ and} \\
W(S;\beta_{ij}) &= \sum_{k} W(S;\delta_{ijk}),
\end{aligned} \]

and substituting back

\[ W(S;\psi) = W(S;\gamma) + \sum_{i,j,k} W(S;\delta_{ijk}). \]


In other words, the total work by S is the simple sum of the CPU work 
and all the drive work. Another such relationship based on a partition into 
equipment class subconfigurations often turns out to be very useful: 


\[ W(S;\psi) = W(S;\mathrm{CPU}) + W(S;\mathrm{disk}) + W(S;\mathrm{tape}) + W(S;\mathrm{ptr}) + \ldots \]

This property of software work, the whole is equal to the sum of its 
parts no matter how the whole is partitioned, is called the extensive 
property. For the purposes of theory construction it is a very valuable 
property simply because it permits us to do ordinary arithmetic when dealing 
with the variable work or functions of it. 
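
The extensive property lends itself to a quick check: summing a work table along any partition gives the same total. The table below is invented for illustration, and the helper `W` is a hypothetical name that mirrors the notation $W(S;\chi)$.

```python
# Hypothetical work, in works, by (software unit, equipment class).
work = {
    ("S1", "cpu"): 120, ("S1", "disk"): 40,
    ("S2", "cpu"): 60, ("S2", "disk"): 30, ("S2", "tape"): 10,
}


def W(units, devices):
    """Work done by the given software units on the given device classes."""
    return sum(w for (unit, device), w in work.items()
               if unit in units and device in devices)


units = {"S1", "S2"}
devices = {"cpu", "disk", "tape"}

total = W(units, devices)
by_unit = sum(W({unit}, devices) for unit in units)        # partition of S
by_device = sum(W(units, {device}) for device in devices)  # partition of chi
```

Both partitions recover the same total, which is what permits ordinary arithmetic on work.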





WORK AND TIME RELATIONSHIPS: POWER 


In every branch of physics there arises naturally a variable which
is the rate at which work is done over time. It is called power and
is defined

\[ P = \frac{dW}{dT}, \]

or more simply,

\[ P = \frac{W}{T}, \]

which is the average power over the interval $T$. Since work and time
are now defined in software physics, the concept of software power
also arises naturally.


Stated in its most general terms, software power is defined by the
relationship:

\[ P(S_i,S;\chi^*,\chi) = \frac{W(S_i;\chi^*)}{\mathrm{Tx}(S;\chi)}. \]

In the case where $S_i = S$ we have the subconfiguration power
relationship:

\[ P(S;\chi^*,\chi) = \frac{W(S;\chi^*)}{\mathrm{Tx}(S;\chi)}. \]

In the case where $\chi^* = \chi$ we have the software unit power relationship:

\[ P(S_i,S;\chi) = \frac{W(S_i;\chi)}{\mathrm{Tx}(S;\chi)}. \]

All of these cases show the work done for some software unit by some
subconfiguration relative to some higher clock. They are, therefore,
relative powers.


In the case where both the software units and the subconfigurations
are equal we have

\[ P(S;\chi) = \frac{W(S;\chi)}{\mathrm{Tx}(S;\chi)}, \]

the work of $(S;\chi)$ with respect to its own clock. This is called
absolute power. It is absolute only in the sense that it is not
relative. No implication of "ultimate" is intended, for absolute
power may be varied, especially in the case of I/O.


A relationship of fundamental importance is revealed by the
following sequence of rewritings:

\[ \begin{aligned}
P(S;\chi^*,\chi) &= \frac{W(S;\chi^*)}{\mathrm{Tx}(S;\chi)} \cdot \frac{\mathrm{Tx}(S;\chi^*)}{\mathrm{Tx}(S;\chi^*)} \\
&= \frac{W(S;\chi^*)}{\mathrm{Tx}(S;\chi^*)} \cdot \frac{\mathrm{Tx}(S;\chi^*)}{\mathrm{Tx}(S;\chi)} \\
&= P(S;\chi^*) \cdot U(S;\chi^*,\chi),
\end{aligned} \]

i.e., the relative power equals the absolute power of the lower
subconfiguration times the utilization. Dually, for software units,
we have

\[ P(S_i,S;\chi) = P(S_i;\chi) \cdot U(S_i,S;\chi). \]

When required, the chain rule for utilizations may be employed, such
as $P(S;\chi^{**},\chi) = P(S;\chi^{**}) \cdot U(S;\chi^{**},\chi^*) \cdot U(S;\chi^*,\chi)$, and so on.
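
A numerical sketch of the relation between relative and absolute power follows. The work figure is an assumption chosen for round numbers; the two times echo values used earlier for a drive and its channel.

```python
import math

work = 360.0       # W(S; chi*), hypothetical, in works
tx_lower = 72.5    # Tx(S; chi*): the lower subconfiguration's own clock
tx_higher = 90.0   # Tx(S; chi): the containing subconfiguration's clock

absolute_power = work / tx_lower     # P(S; chi*)
utilization = tx_lower / tx_higher   # U(S; chi*, chi)
relative_power = work / tx_higher    # P(S; chi*, chi)
```

The relative power is lower than the absolute power by exactly the utilization factor, so either one can be derived when the other and the utilization are measured.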


Now, observe

\[ P(S;\chi) = \frac{W(S;\chi)}{\mathrm{Tx}(S;\chi)} = \sum_{i} \frac{W(S_i;\chi)}{\mathrm{Tx}(S;\chi)} = \sum_{i} P(S_i,S;\chi), \]

where the $S_i$ are a partition of $S$. Dually we obtain

\[ P(S;\chi) = \sum_{*} P(S;\chi^*,\chi), \]

where the $\chi^*$'s are a subconfiguration partition of $\chi$. In other words,
the absolute power is the sum of the relative powers of any partition.


The above manipulations suggest that the same sort of thing might
be done using ratios of work rather than ratios of time. The ratio

\[ \frac{W(S_i;\chi)}{W(S;\chi)} \]

will be written $D(S_i,S;\chi)$ because it is a distribution number, as
will become apparent presently. Now, $P(S_i,S;\chi) = D(S_i,S;\chi) \cdot P(S;\chi)$
and dually, $P(S;\chi^*,\chi) = D(S;\chi^*,\chi) \cdot P(S;\chi)$. In other words, the
relative power equals the absolute power of the higher subconfiguration
times the distribution. We call these distribution numbers
because

\[ \sum_{*} D(S;\chi^*,\chi) = \frac{\sum_{*} W(S;\chi^*)}{W(S;\chi)} = 1, \]

that is, the D's are such that $0 \le D_i \le 1$ and

\[ \sum_{i} D_i = 1, \]

which are the properties of a frequency distribution. Consequently,

\[ \sum_{*} P(S;\chi^*,\chi) = \sum_{*} D(S;\chi^*,\chi) \cdot P(S;\chi) = P(S;\chi), \]

which brings us full circle. Of course, there is also a chain rule
for distribution numbers, e.g.,

\[ D(S;\chi^{**},\chi) = D(S;\chi^{**},\chi^*) \cdot D(S;\chi^*,\chi). \]
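
Distribution numbers and their link to power can be illustrated as follows. The work figures and the container time are hypothetical, and the equipment-class names are used only as convenient labels for a partition.

```python
import math

# Hypothetical W(S; chi*) for a partition of chi into equipment classes.
work_by_sub = {"cpu": 150.0, "disk": 90.0, "tape": 60.0}
tx_container = 60.0   # Tx(S; chi), hypothetical

total_work = sum(work_by_sub.values())
absolute_power = total_work / tx_container           # P(S; chi)

# D(S; chi*, chi): share of the container's work done by each chi*.
distribution = {name: w / total_work for name, w in work_by_sub.items()}

# P(S; chi*, chi): work of each chi* over the container's clock.
relative_power = {name: w / tx_container for name, w in work_by_sub.items()}
```

Each relative power equals the corresponding distribution number times the absolute power of the container, and the relative powers add up to that absolute power.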


One final set of observations should be made, relating the
absolute powers, relative powers, utilizations and distributions:

\[ \begin{aligned}
P(S_i,S;\chi^*,\chi) &= D(S_i,S;\chi^*) \cdot U(S;\chi^*,\chi) \cdot P(S;\chi^*) \\
\text{or}\quad &= D(S_i;\chi^*,\chi) \cdot U(S_i,S;\chi) \cdot P(S_i;\chi).
\end{aligned} \]

The earlier observations, where either $S_i = S$ or $\chi^* = \chi$, can be
derived from these last two by simply substituting unity for the
appropriate D's and U's. The reader can discover for himself what
summing the first equation over $i$ and summing the second over $*$
produce.
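
Both factorizations of the composite relative power can be verified against the defining ratio. All numbers below are invented, and the keys `"chi_star"` and `"chi"` are hypothetical labels for the lower and higher subconfigurations.

```python
import math

# Hypothetical work W(unit; node) and execution time Tx(unit; node).
W = {("S1", "chi_star"): 40.0, ("S", "chi_star"): 100.0,
     ("S1", "chi"): 64.0, ("S", "chi"): 250.0}
TX = {("S1", "chi_star"): 8.0, ("S", "chi_star"): 25.0,
      ("S1", "chi"): 20.0, ("S", "chi"): 50.0}

# Definition: P(S1, S; chi*, chi) = W(S1; chi*) / Tx(S; chi).
defined = W[("S1", "chi_star")] / TX[("S", "chi")]

# First form: D(S1, S; chi*) * U(S; chi*, chi) * P(S; chi*).
first = ((W[("S1", "chi_star")] / W[("S", "chi_star")])
         * (TX[("S", "chi_star")] / TX[("S", "chi")])
         * (W[("S", "chi_star")] / TX[("S", "chi_star")]))

# Second form: D(S1; chi*, chi) * U(S1, S; chi) * P(S1; chi).
second = ((W[("S1", "chi_star")] / W[("S1", "chi")])
          * (TX[("S1", "chi")] / TX[("S", "chi")])
          * (W[("S1", "chi")] / TX[("S1", "chi")]))
```

The intermediate quantities cancel in each product, so both forms reduce to the defining ratio regardless of the particular values chosen.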


It may be helpful at this point to discuss the significance of
these results, in the sense of understanding their impact, relating
them to other approaches or interpretations. We have already remarked
that utilization in software physics corresponds to utilization in
queueing theory. Now let us consider absolute power. $P(S;\chi)$ is the
rate at which $\chi$ produces work when it is executing. It is a service
rate. Put another way,

\[ \frac{1}{P(S;\chi)} = \frac{\mathrm{Tx}(S;\chi)}{W(S;\chi)} \]

is the time for $\chi$ to complete a unit of work when $\chi$ is working. It is
the service time of $\chi$. Relative power, $P(S;\chi^*,\chi)$, on the other
hand, is the rate at which $\chi^*$ produces work over a larger time, an
observation time, $\mathrm{Tx}(S;\chi)$. It is, therefore, the throughput rate
(over the time $\mathrm{Tx}(S;\chi)$). Recalling and rewriting one of the first
results in this section:

\[ U(S;\chi^*,\chi) = P(S;\chi^*,\chi) \cdot \frac{1}{P(S;\chi^*)} \]

or, in the Buzen-Denning notation, $U_\chi = X_\chi \cdot S_\chi$, which is the
utilization law of operational analysis.

Distribution numbers do not have a simple correspondent in
operational analysis. They resemble in some ways visit ratios and in
other ways routing frequencies. Distributions relative to the work of
the full configuration, $W(S;\psi)$, are visit ratios. Distributions with
respect to the work of a containing node, e.g.,

\[ \frac{W(S;\chi^*)}{W(S;\chi)}, \]

are routing frequencies. Our interest in them will be that they
characterize the way the work burden is distributed over
subconfigurations. That is, they are a basic device for characterizing
workloads, taken separately. Contrariwise, relative powers are a
characterization of the way work flow distributes over
subconfigurations, taken separately. These observations are the basis for
relating the propriety of a configuration for a given workload and
vice-versa.


FUNDAMENTAL VARIABLES: STORAGE OCCUPANCY


Storage occupancy is a spatial concept, a measure of the extent of 
storage associated with a software unit. The unit of measure is a 
container, in our case, typically, an 8-bit byte. This is both 
conventional and convenient for our purposes. To establish the 
association between software units and storage we again appeal to the 
complete instruction "trace." Associated with each instruction of a 
software unit is a set of containers and their addresses, typically 
derived from an initial address for the instruction and a known length 
in containers for the instruction. Likewise, associated with the 
operands are sets of containers and their addresses. The amount of 
storage occupancy is the count of the containers in the set which is 
the union of the sets of instruction and operands containers. This is 
the instruction or instantaneous storage occupancy. The storage 
occupancy of a software unit is the count of containers in the set 
formed by the union of the sets making up the instruction occupancies.
Because the existence of the instructions and operands in the
storages constitutes a realization of them, the notation for storage 
occupancy is R(S;G), where G is the collection of storages. 

Several observations should be made regarding this definition.
First, because a software unit consists of instruction executions over
some time interval, there is an implicit time interval associated with
storage occupancy. The most natural one is from the first instruction
execution to the last with respect to the system clock, i.e.,
$\mathrm{Te}(S;\psi)$. Another, of course, is the execution times of the various
processors manipulating the storages, such as $\mathrm{Tx}(S;\psi)$. Now, if the
storages are partitioned it is the case that

\[ R(S;G) = \sum_{*} R(S;G_*), \]

where the $G_*$ are a partitioning of $G$. Typically, in practice, the
partitioning would be into main storages and the storage associated
with various drives. Then the total storage occupied by $S$ is the sum
of the main storage occupancy and that associated with all the
relevant drives.

If the time interval involved is with respect to some higher
clock, then we can discuss other software units over the same
interval. It should be clear from the previous discussion that over time
two software units may, by turns, occupy some containers, even though
the software units are disjoint. To form the composition of storage
occupancies we must first form the union of the sets of containers and
then take the count. However, this yields the same result as first
taking the union of the software units and then measuring the storage
occupancy, so that we have

\[ R(S;\sigma) = \Bigl[\,\bigcup_{*} \{S_*;\sigma\}\Bigr] = \Bigl[\bigl\{\bigcup_{*} S_*;\sigma\bigr\}\Bigr] \le \sum_{*} R(S_*;\sigma), \quad\text{where } S = \bigcup_{*} S_*, \]

which may be a partition or not, and where $[\,]$ indicates the measure
function (count) of the set $\{\}$.
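
The set-based definition of storage occupancy can be sketched directly with Python sets. The trace format used here, a list of (instruction address, instruction length, operand addresses) tuples, is an assumption made for the example, as are the addresses themselves.

```python
def containers(trace):
    """Set of container addresses touched by a software unit's trace.

    Each trace entry is (instruction_address, length, operand_addresses).
    """
    used = set()
    for address, length, operands in trace:
        used.update(range(address, address + length))  # instruction bytes
        used.update(operands)                          # operand bytes
    return used


# Two hypothetical, disjoint software units that share operand containers.
unit_a = [(100, 4, [200, 201]), (104, 2, [200])]
unit_b = [(300, 4, [200, 201, 202])]

r_a = len(containers(unit_a))               # R(S_a; sigma)
r_b = len(containers(unit_b))               # R(S_b; sigma)
r_union = len(containers(unit_a + unit_b))  # R(S_a U S_b; sigma)
```

Because the two units take turns occupying containers 200 and 201, the occupancy of their union is smaller than the sum of the separate occupancies.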

The above definition differs from some common measures of storage
occupancy. It is usual, thinking in terms of programs instead of
software units, to consider the main storage required to contain the
loaded program as its main storage occupancy even though some of its
instructions are not executed in the consequent run of the program.
In our definition, the unexecuted instructions would not appear in any
software unit and hence would not contribute to the storage occupancy
measure. If one feels compelled to identify the two notions then the
instructions of the load process must be considered part of the
software unit. Then everything else follows as required. Another common
measure is the storage allotted or allocated to a program run. This
amount of storage is a function of operating system policy and
arbitrary human action. It has no necessary relationship to
characteristics arising from the executing system itself and, therefore,
cannot be treated in purely software physical terms.


CONCLUDING REMARKS


Only the nuclear theory, the fundamental concepts and primary 
results, have been presented here. A considerable body of work based 
on software physics has been done already, chiefly by the staff and 
members of the Institute for Software Engineering, but it has not been 
generally published. 

In the purely theoretical area, Traister has developed the 
relationships to operational analysis and queueing theory which are 
merely suggested in this paper. In the area of applied theory much 
work has been done on workload forecasting, accounting and related 
financial issues and capacity analysis and planning, especially of I/O 
subconfigurations. 

In the engineering and "laboratory" areas powers of various CPU's
have been measured by both software and hardware monitor techniques,
as have a variety of peripheral devices and subsystems. Software
physics has proved very valuable in software benchmarking, providing
methods of comparing the efficiency of the algorithms themselves,
distinct from the influence of hardware, as well as the more
conventional performance comparisons.

Much remains to be done in extending the theory, in increasing the
scope of applications of the theory and in measurement and
engineering. There are some very difficult problems in all three areas that
have not yet even been approached. It is hoped that this paper will
stimulate interest and encourage participation, for the more hands
turned to the tasks, the more likely the successes.